Conversation

@Binyang2014
Contributor

@Binyang2014 Binyang2014 commented Nov 19, 2025

Reorganize the current native and DSL algorithm implementations.
Provide a unified API for DSL and native algorithms, plus an interface for tuning them.
Provide an interface for PyTorch integration with the native API and the DSL.

Copilot AI and others added 2 commits January 15, 2026 09:46
…d style consistency (#725)

- [x] Fix license text: "MIT license" → "MIT License" in multiple files
- [x] Rename files with "_2" suffix and update references
- [x] Add missing license headers
- [x] Fix header guards to follow MSCCLPP_EXT_<FILE_NAME>_HPP_ format
- [x] Fix enum naming consistency
  - [x] CommResult enum to CamelCase
  - [x] CollectiveBufferMode enum to CamelCase (Any, InPlace, OutOfPlace)
  - [x] AlgorithmType enum to CamelCase (Native, DSL)
- [x] Fix comment in src/core/gpu_utils.cc
- [x] Fix LogSubsys::COUNT case in src/core/include/logger.hpp
  - [x] Add explanatory comment
  - [x] Add [[fallthrough]] attribute
- [x] Apply clang-format
- [x] Remove _codeql_detected_source_root file and add to .gitignore
- [x] Update documentation paths
  - [x] Fix NCCL library paths: build/apps/nccl/ → build/lib/
  - [x] Fix test binary paths: ./test/ → ./bin/

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: chhwang <[email protected]>
@Binyang2014
Contributor Author

/azp run mscclpp-ut

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@Binyang2014
Contributor Author

/azp run mscclpp-ut

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@Binyang2014
Contributor Author

/azp run mscclpp-ut

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@Binyang2014
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@Binyang2014
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@Binyang2014
Contributor Author

/azp run mscclpp-ut

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@Binyang2014 Binyang2014 requested a review from chhwang January 18, 2026 17:25
Contributor

@chhwang chhwang left a comment

Let's tackle comments in another PR

__all__ += algorithm.__all__
__all__ += comm.__all__
__all__ += compiler.__all__
__all__ += buffer.__all__
Contributor

unneeded

from mscclpp._mscclpp import (
    Algorithm as _Algorithm,
    DslAlgorithm as _DslAlgorithm,
    AlgorithmType as _AlgorithmType,
Contributor

Python-bound objects would be named differently in the future (e.g., by adding a Cpp* prefix) to avoid confusion.

)
).hexdigest()

plan_dir = os.environ.get("MSCCLPP_EXECUTION_PLAN_DIR", Path.home() / ".cache/mscclpp")
Contributor

  1. All env variables should be collected and documented in env.cpp, even if they are used only at the Python level.
  2. Is the execution plan part of the "cache", meaning that it belongs to the system rather than to the user?
  3. I'd prefer a single MSCCLPP_CACHE_DIR env variable, with us deciding internally what goes where.
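
One way to read point 3 is a single cache root with internally chosen subdirectories. The sketch below is only an illustration: `MSCCLPP_CACHE_DIR` is the variable proposed in this comment, while the helper name `cache_subdir` and the subdirectory names `plans` and `native` are hypothetical and not part of mscclpp.

```python
import os
from pathlib import Path


def cache_subdir(kind: str, env=None) -> Path:
    """Return the cache directory for one artifact kind (hypothetical helper).

    Assumes a single MSCCLPP_CACHE_DIR root, as proposed in the review;
    the library, not the user, decides the layout beneath it.
    """
    env = os.environ if env is None else env
    # Wrap both branches in Path so the result type does not depend on
    # whether the variable is set (os.environ values are plain strings).
    root = Path(env.get("MSCCLPP_CACHE_DIR") or Path.home() / ".cache" / "mscclpp")
    return root / kind


plan_dir = cache_subdir("plans")     # hypothetical name for execution plans
native_dir = cache_subdir("native")  # hypothetical name for compiled native algorithms
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.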

recompilation.
The cache location can be configured via the `MSCCLPP_NATIVE_CACHE_DIR`
environment variable (defaults to `~/.cache/mscclpp/native`).
Contributor

I'd prefer a single MSCCLPP_CACHE_DIR env variable, with us deciding internally what goes where.

    rocm_home = os.environ.get("ROCM_HOME")
    return os.path.join(rocm_home, "bin/hipcc") if rocm_home else "hipcc"
else:
    cuda_home = os.environ.get("CUDA_HOME")
Contributor

Why not MSCCLPP_CUDA_HOME/MSCCLPP_ROCM_HOME?
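
A sketch of what the suggested override could look like, with the MSCCLPP-prefixed variable taking precedence and the standard one kept as a fallback. `MSCCLPP_CUDA_HOME` and `MSCCLPP_ROCM_HOME` are the names floated in this comment, not existing settings; `find_compiler` and the `nvcc` branch are assumptions, since the quoted diff only shows the `hipcc` path.

```python
import os


def find_compiler(use_rocm: bool, env=None) -> str:
    """Pick the GPU compiler path (hypothetical helper, not the PR's code)."""
    env = os.environ if env is None else env
    if use_rocm:
        prefixed, standard, binary = "MSCCLPP_ROCM_HOME", "ROCM_HOME", "hipcc"
    else:
        prefixed, standard, binary = "MSCCLPP_CUDA_HOME", "CUDA_HOME", "nvcc"
    # The project-specific variable wins; the standard one is the fallback.
    home = env.get(prefixed) or env.get(standard)
    # With neither variable set, rely on the compiler being on PATH.
    return os.path.join(home, "bin", binary) if home else binary
```

Keeping the unprefixed variables as a fallback would preserve the behavior of existing setups that only export `CUDA_HOME` or `ROCM_HOME`.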


from .algorithm_collection_builder import *

__all__ = algorithm_collection_builder.__all__
Contributor

unneeded

break;
case DataType::FP8_E4M3:
case DataType::FP8_E5M2:
// FP8 is not supported in CUDA execution kernel.
Contributor

Should we warn or throw here?

@Binyang2014
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@Binyang2014
Contributor Author

Will resolve all the comments in another PR
